DEMONSTRATION OF FORMULA FOR TRUE
MEASUREMENT OF CORRELATION. 



By C. Spearman. 



It seems daily more evident, that one of the most important 
tasks awaiting psychologists is the accurate measurement of 
the 'correlations' (i.e., the tendencies to concurrent variation)
between psychical events, qualities, faculties or other charac- 
teristics. For this purpose a now well-known method of cal- 
culation has been evolved by Bravais, Galton and Pearson, 
which furnishes a numerical 'coefficient' measuring precisely
the degree of proportionality between any two series of values;
this coefficient is usually denoted by the symbol r.¹ Should
on any occasion the correlation between the two series take 
some more complicated form than that of simple proportion, 
then r has to be supplemented by further terms to express such 
correlation completely; r remains, however, the principal term 
(with or without some unessential modification of outward 
shape). 

Unfortunately, there is a considerable step between arriving 
at a coefficient of correlation and discovering the true coefficient. 
In the first place, the values immediately attainable by inves- 
tigation are not those of the characteristics really investigated, 
but only those of measurements, and — in the case of psychol- 
ogy — for the most part very fallible measurements. Secondly, 
it is usually quite impossible to keep the investigation clear 
of many factors that do not properly belong to it. The 
actual effect of these two disturbances, the observational errors
and the irrelevant factors, is not merely to diminish somewhat
the accuracy of the calculation, but to render the apparent
correlation (whether calculated or merely casually inspected)
wholly untrustworthy. A large correlation may be obliterated;
an illusive one may be conjured up where none exists
really; it may even happen that a positive correlation is turned
into an apparently negative one, or vice versa.

¹ For method of calculation see this Journal, 1904, XV, pp. 77-8 (but
for "median" substitute "average").

Now, two formulae were given by me in this Journal some
time ago,¹ whereby the effect of both these disturbances can, as
I believe, be eliminated. The formulae were, however, not
accompanied by any proofs. So many other mathematical formulae
were given at the same time, that the formal demonstrations
of them all would have made the article exceedingly
cumbersome.² But since then I have repeatedly been asked
for these proofs; some mathematicians have gone so far as to
doubt whether such proofs could possibly be valid. It therefore
seems advisable to publish them.³

For convenience of demonstration, I will commence with the 
formula for eliminating irrelevant factors, although in applica- 
tion the other formula must be used first (to all the coefficients 
entering into the former formula). 

1. Proof of the formula for eliminating the effect of irrelevant
factors.⁴

Let X, Y, and Z denote the values of any three variable
and correlated characteristics of objects of any particular class
(for instance, their height, length and breadth). Let their
average values be denoted by $a_x$, $a_y$ and $a_z$ respectively; let
$a_x - X = x$, $a_y - Y = y$, and $a_z - Z = z$. Further, let $b_{xy}$
and $b_{xz}$ be such values, that $\sum [x - (b_{xy} y + b_{xz} z)]^2$ is a
minimum. Equating to 0 the differentials of this sum with
respect to both $b_{xy}$ and $b_{xz}$, and solving these two equations
for $b_{xy}$, we find

$$b_{xy} = \frac{\sum xy \cdot \sum z^2 - \sum xz \cdot \sum yz}{\sum y^2 \cdot \sum z^2 - (\sum yz)^2} = \frac{r_{xy} - r_{xz} \cdot r_{yz}}{1 - r_{yz}^2} \cdot \frac{\sqrt{\sum x^2}}{\sqrt{\sum y^2}},$$

where $r_{xy}$ denotes Pearson's correlational coefficient

$$\frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}},$$

and $r_{xz}$ and $r_{yz}$ have similar meanings.

¹ Vol. XV, 1904, pp. 88-96.

² As an example, it may be mentioned that another formula given in
the same article has just had to be demonstrated also. The simple
statement of values, as originally given in this Journal, took up only
a couple of lines; whereas the mathematical demonstration (in the
British J. of Psych.) fully occupies three pages.

³ These formulae are again much utilized in a paper of Professor
Krueger and myself in the Zeitschrift für Psychologie (44, p. 48).

⁴ Parts of the printer's proofs have very kindly been looked over by
Drs. Herglotz and Carathéodory (lecturers on mathematics, University
of Göttingen) and Dr. Ehrenfest, who have furnished valuable
criticisms.

Analogously

$$b_{yx} = \frac{r_{xy} - r_{xz} \cdot r_{yz}}{1 - r_{xz}^2} \cdot \frac{\sqrt{\sum y^2}}{\sqrt{\sum x^2}},$$

so that¹

$$\sqrt{b_{xy} \cdot b_{yx}} = \frac{r_{xy} - r_{xz} \cdot r_{yz}}{\sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}}. \tag{a}$$

Let us next suppose our whole class of objects to be split
up into groups, a group embracing all those objects for which
Z has any constant value. Let us apply the considerations and
notation of the preceding paragraph to the $k$th such group
apart. The term $b_{xz}$ vanishes, since z clearly = 0.

We get, then, $\sum (x_k - b_{xyk} y_k)^2$ = a minimum, from which

$$\frac{d}{d b_{xyk}} \sum (x_k - b_{xyk} y_k)^2 = 0, \quad \text{so that} \quad b_{xyk} = \frac{\sum x_k y_k}{\sum y_k^2}.$$

As $b_{yxk}$ has an analogous value,

$$\sqrt{b_{xyk} \cdot b_{yxk}} = \frac{\sum x_k y_k}{\sqrt{\sum x_k^2 \cdot \sum y_k^2}} = r_{xyk}. \tag{b}$$
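The two expressions for $b_{xy}$ and equation (a) can be checked numerically. The sketch below is only an illustration added for the reader: the data are randomly generated, the helper names (`sp`, `deviations`, `slope`) are hypothetical, and the seed and sample size are arbitrary. It uses the Python standard library only.

```python
import math
import random


def sp(u, v):
    """Sum of products of two equally long series of deviations."""
    return sum(a * b for a, b in zip(u, v))


def deviations(values):
    """Deviations from the average, using the text's convention a - X."""
    mean = sum(values) / len(values)
    return [mean - v for v in values]


def pearson(u, v):
    """Pearson's r for two series of deviations."""
    return sp(u, v) / math.sqrt(sp(u, u) * sp(v, v))


def slope(x, y, z):
    """Coefficient of y when x is fitted by b_xy*y + b_xz*z (least squares)."""
    return (sp(x, y) * sp(z, z) - sp(x, z) * sp(y, z)) / (
        sp(y, y) * sp(z, z) - sp(y, z) ** 2
    )


random.seed(0)  # arbitrary seed; the data are purely illustrative
Z = [random.gauss(0, 1) for _ in range(500)]
X = [zi + random.gauss(0, 1) for zi in Z]
Y = [0.5 * xi + zi + random.gauss(0, 1) for xi, zi in zip(X, Z)]
x, y, z = deviations(X), deviations(Y), deviations(Z)

r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)

b_xy = slope(x, y, z)
b_xz = slope(x, z, y)  # same formula with the roles of y and z exchanged
b_yx = slope(y, x, z)  # same formula with the roles of x and y exchanged

# At the minimum the residuals have zero sum of products with y and with z.
residual = [xi - b_xy * yi - b_xz * zi for xi, yi, zi in zip(x, y, z)]

# Second form of b_xy, written with the correlation coefficients.
b_xy_from_r = (r_xy - r_xz * r_yz) / (1 - r_yz ** 2) * math.sqrt(sp(x, x) / sp(y, y))

# Equation (a). The square root yields the magnitude only; in these data
# the numerator is positive, so the two sides agree in sign as well.
lhs_a = math.sqrt(b_xy * b_yx)
rhs_a = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

print(b_xy, b_xy_from_r)
print(lhs_a, rhs_a)
```

Each printed pair should agree to rounding error, whatever data are substituted, provided the numerator of (a) is positive.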

Now, in general the value of $b_{xyk}$ differs from that of $b_{xy}$;
but in three special cases here interesting us it can be shown
to coincide.

The first case occurs when the following assumptions are 
permissible : 

(c) 
(d) 
(e) 
(f) 

Conceive all
the objects as represented geometrically by positions having
as ordinates X, Y and Z. The equations $x_k - b_{xyk} y_k = 0$ evidently
denote what may be called 'minimal lines,' $b_{xyk}$ being
determined for each value of k by the relation $\sum (x_k - b_{xyk} y_k)^2$
= a minimum; and all such minima may be regarded as part
sums of a total minimum which, transforming from $x_k$ and $y_k$ to
X and Y, may be written as

$$\sum_k \sum [X - b_{xyk} Y - (a_{xk} - b_{xyk} a_{yk})]^2,$$

or, shorter, as $\sum_k \sum (X - b_{xyk} Y - E_k)^2$, or simply as $\sum m$.

¹ This result was reached, in a somewhat different manner, by Yule
(Proc. R. Soc., Vol. LX).

On the other hand, $x - b_{xy} y - b_{xz} z = 0$ is a 'total minimal
plane,' $b_{xy}$ and $b_{xz}$ being determined by the relation
$\sum (x - b_{xy} y - b_{xz} z)^2$ = a minimum; and this minimum
can be regarded as made up of part sums (not necessarily minima
individually), one for each different value of Z, and therefore
may be written as

$$\sum_k \sum [X - b_{xy} Y - (b_{xz} Z_k + a_x - b_{xy} a_y - b_{xz} a_z)]^2,$$

or, shorter, as $\sum_k \sum (X - b_{xy} Y - E'_k)^2$,
or simply as M. Evidently, the difference between $\sum m$ and M
depends on that between $E_k$ and $E'_k$. But the former is the
value that makes every part sum of the form in question a
minimum, as may readily be found by determining E from the
equation $\frac{d}{dE} \sum (X - bY - E)^2 = 0$. Hence, if $E'_k$ differs
from $E_k$ for any single value of k, the corresponding part sum
of M becomes greater than that of $\sum m$. A fortiori, the total
effect of all differences between $E'_k$ and $E_k$ for all values of k
must be to make M greater than $\sum m$. But in the present case
it happens to be possible for all corresponding parts of $\sum m$ and
M respectively to be identical; for the 'minimal lines,' from
which $\sum m$ derives, must, owing to the conditions (c), (e) and
(f), lie in one plane; and there is no condition preventing
this plane from coinciding with the 'total minimal plane,' from
which M derives. Since M can wholly coincide with $\sum m$, it
must do so, for otherwise it would not be a minimum, as required
by hypothesis. Hence finally, $b_{xy} = b_{xyk}$ and, taking
into consideration the equations (a) and (b), we arrive at the
desired result,

$$r_{xyk} = \frac{r_{xy} - r_{xz} \cdot r_{yz}}{\sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}}. \tag{g}$$
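Equation (g) is simple enough to evaluate directly. The following is a minimal sketch; the function name `partial_correlation` and all the input coefficients are hypothetical, chosen only to show how an observed correlation may be reduced, obliterated, or reversed in sign once the third variable is held constant.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Equation (g): correlation of x and y for a constant value of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical coefficients, for illustration only.
reduced = partial_correlation(0.50, 0.60, 0.60)   # (0.50 - 0.36) / 0.64
vanished = partial_correlation(0.36, 0.60, 0.60)  # numerator is (nearly) zero
reversed_sign = partial_correlation(0.20, 0.70, 0.70)  # (0.20 - 0.49) / 0.51

print(reduced, vanished, reversed_sign)
```

In the third call an apparently positive correlation of 0.20 corresponds to a negative value once z is held constant, which is the kind of reversal mentioned in the introduction.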



The second case is where equation (f) is assumed once more
and also

$$b_{xy0} = b_{xy1} = b_{xy2} = \dots = b_{xyk} = \dots = 0. \tag{h}$$

It is clear that all the minimal lines must again lie in one plane,
and this can again be shown, by reasoning as above, to coincide
necessarily with the 'total minimal plane.' Consequently
$b_{xy} = b_{xyk} = 0$. Hence equation (g) again holds good, either
side of it now being equal to 0.
The third case is similar to the first, with the exception that
$b_{xyk}$ is permitted to vary with k. Under these conditions the
equation (g) will be found valid to the extent of giving the
most probable mean value of $r_{xyk}$.

In equation (g) we have reached at any rate the form of the 
formula for eliminating irrelevant factors. But now we have 
to see whether, also, the constituent terms of this equation 
correspond to the respective values concerned in the actual 
investigation of correlation. Let us therefore turn to actuality,
say, to the correlation found in my previous paper between 
children's power of discriminating pitch and that of discriminat- 
ing weight. As irrelevant factor let us take the children's vary- 
ing age. Clearly, the actually observed correlational coefficients 
are derived from three variables, discrimination of pitch, dis- 
crimination of weight, and age; they correspond perfectly to 
$r_{xy}$, $r_{xz}$ and $r_{yz}$ on the right of equation (g). But the factor of age
is obviously irrelevant and disturbing; suppose, for instance, that 
both discrimination of pitch and that of weight improved with 
age; then of course we should find a correlation between the 
two sorts of discrimination, for both would be best in the same 
children, namely, the oldest; such correlation is evidently be- 
side the question. Our true course of investigation would 
have been to select for experiment children of exactly the same 
age; but this is precisely the way we arrived at $r_{xyk}$: thus $r_{xyk}$
is the desired true correlation between x and y. 
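The argument about age can be imitated with artificial data. The sketch below is an assumption-laden illustration, not the children's measurements of the earlier paper: both "discriminations" are made to depend linearly on age and on nothing else in common, the noise levels, seed and sample size are arbitrary, and the helper names are hypothetical.

```python
import math
import random


def pearson(u, v):
    """Pearson's r for two equally long series of raw values."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def partial_correlation(r_xy, r_xz, r_yz):
    """Equation (g)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


random.seed(1)  # arbitrary seed
n = 5000
age = [random.uniform(9, 12) for _ in range(n)]
# Artificial scores: each improves with age, with independent disturbances.
pitch = [a + random.gauss(0, 1) for a in age]
weight = [a + random.gauss(0, 1) for a in age]

r_observed = pearson(pitch, weight)
r_corrected = partial_correlation(
    r_observed, pearson(pitch, age), pearson(weight, age)
)

# Direct check: correlate only children of (nearly) the same age.
same_age = [(p, w) for a, p, w in zip(age, pitch, weight) if 10.4 <= a <= 10.6]
r_same_age = pearson([p for p, _ in same_age], [w for _, w in same_age])

print(r_observed, r_corrected, r_same_age)
```

Under these assumptions the observed coefficient is clearly positive, while both the value from (g) and the coefficient for children of nearly equal age lie close to zero.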

It remains to be considered how far the actual observational 
material fulfils the special conditions under which alone the 
formula (g) has been shown to be exact. The present topic, 
that is, the elimination of irrelevant factors, comes under our 
above 'first case;' we need, therefore, the relations, (c), (d), 
(e) and (f). Of these the two latter are equivalent to requiring
the correlations of the irrelevant term (here, age) with the
two main terms (here, discrimination of pitch and of weight)
to be linear. This limitation is less serious than might be
supposed; for the inexactness of the corrective formula only
becomes appreciable when the irrelevant correlations depart from
the linear form very largely; and experience has shown such
large deviations to be extremely rare. Anyhow, it can scarcely
be a matter of surprise that irrelevant correlations become
difficult to treat when non-linear, seeing that no quite satisfactory
formulae have yet been discovered even for the bare measurement
(i.e., antecedently to all corrections) of the main correlation
when non-linear.

Finally, the two other conditions, (c) and (d), mean that the
true correlation must not change appreciably for the different
values of the irrelevant term. Now, such changes may be taken
as, in general, of a smaller order of magnitude than the alterations
to be eliminated by the corrective formula, those produced
by mixing different values of the irrelevant term. To return 
to our instance, there is good reason to believe that most cor- 
relations are very similar for children of 9 years as for those of 
12, although the gravest disturbance will often occur if 9 and 
12 years be thrown together into one and the same correlation. 
Conditions (c) and (d) may therefore be considered sufficiently 
satisfied whenever the irrelevant influence to be eliminated is 
of moderate amount. But if, instead of confining our experi- 
ments to the ages of 9-12, we had included those down to, say, 
5, then the true correlation for 5 years would probably have 
had quite a large discrepancy from that for 12. In such case 
one could at most expect any general 'true' correlation to sig- 
nify the true mean correlation; and this, as we have seen, is 
the value actually given by our formula. 

2. Proof of the formula for eliminating the effect of inaccurate
observation.

We will assume any two correlated series of values, X and Y,
to have each been measured twice independently, and to have
yielded the series of measurements $x_1$, $x_2$, $y_1$ and $y_2$. The
coefficients $r_{x_1y_1}$, $r_{x_1y_2}$, $r_{x_2y_1}$, $r_{x_2y_2}$, $r_{x_1x_2}$ and $r_{y_1y_2}$ can, of course, be
reckoned directly. We require a formula to reckon $r_{XY}$.

Let us first consider the correlations between $x_1$, $x_2$ and X,
and see how the coefficient between $x_1$ and $x_2$ becomes modified
when a separate calculation is made for each group of objects
for which X is constant, say, = $X_k$. We may fairly assume
the average of all the measurements $x_{2k}$ (or $x_{1k}$) to
coincide with $X_k$, or at any rate to vary proportionally thereto
for the different values of k; hereby the condition (f) is satisfied.
Further, when we consider any $k$th group quite apart,
since the fluctuations in the two series of measurements $x_{1k}$
and $x_{2k}$ of the same value $X_k$ are by hypothesis independent
of one another, $b_{x_1x_2k}$ (or $b_{x_2x_1k}$) always = 0, thus satisfying
the condition (h). Consequently, we have here our 'second
case' and

$$r_{x_1x_2k} = \frac{r_{x_1x_2} - r_{x_1X} \cdot r_{x_2X}}{\sqrt{(1 - r_{x_1X}^2)(1 - r_{x_2X}^2)}} = 0, \tag{i}$$

where X is fixed at $X_k$ on the left side of the equation, but
remains variable on the right. Therefore, since r cannot be
infinitely great,

$$r_{x_1x_2} = r_{x_1X} \cdot r_{x_2X}, \tag{j}$$

and analogously

$$r_{y_1y_2} = r_{y_1Y} \cdot r_{y_2Y}. \tag{k}$$
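Relation (j) can be illustrated with artificial measurements. In the sketch below the "true" values X are random numbers and each measurement is X plus an independent error; the error sizes, seed and sample size are arbitrary assumptions, and in real work X itself is of course unobservable.

```python
import math
import random


def pearson(u, v):
    """Pearson's r for two equally long series of raw values."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


random.seed(2)  # arbitrary seed
n = 5000
X = [random.gauss(0, 1) for _ in range(n)]
# Two independent measurements of the same values, of unequal accuracy.
x1 = [v + random.gauss(0, 0.5) for v in X]
x2 = [v + random.gauss(0, 1.0) for v in X]

r_x1x2 = pearson(x1, x2)
product = pearson(x1, X) * pearson(x2, X)

print(r_x1x2, product)
```

The two printed values differ only by sampling fluctuation, as (j) requires when the errors of the two measurements are independent.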
Next, let us consider the correlations between $x_1$, X and $y_1$.
If we again make a separate calculation for each group of values
for which X is constant, the condition (f) is satisfied just
as before. Further, in the calculation for any $k$th group
considered quite apart, $X_k$, being a constant, is independent of
the fluctuations in the measurements $y_{1k}$, so that $b_{y_1x_1k} = 0$
and condition (h) is satisfied again. Hence our 'second case'
occurs once more and

$$r_{x_1y_1k} = \frac{r_{x_1y_1} - r_{x_1X} \cdot r_{y_1X}}{\sqrt{(1 - r_{x_1X}^2)(1 - r_{y_1X}^2)}} = 0. \tag{l}$$

From this evidently

$$r_{y_1X} = \frac{r_{x_1y_1}}{r_{x_1X}} \quad \text{and, analogously,} \quad = \frac{r_{x_2y_1}}{r_{x_2X}}. \tag{m}$$

$$\text{Likewise} \quad r_{y_2X} = \frac{r_{x_1y_2}}{r_{x_1X}} = \frac{r_{x_2y_2}}{r_{x_2X}}, \tag{n}$$

$$r_{x_1Y} = \frac{r_{x_1y_1}}{r_{y_1Y}} = \frac{r_{x_1y_2}}{r_{y_2Y}}, \tag{o}$$

$$\text{and} \quad r_{x_2Y} = \frac{r_{x_2y_1}}{r_{y_1Y}} = \frac{r_{x_2y_2}}{r_{y_2Y}}. \tag{p}$$

Finally, let us take the correlations between $x_1$, X and Y.
By reasoning as before, we get

$$r_{x_1Yk} = \frac{r_{x_1Y} - r_{x_1X} \cdot r_{XY}}{\sqrt{(1 - r_{x_1X}^2)(1 - r_{XY}^2)}} = 0,$$

from which it is evident that

$$r_{XY} = \frac{r_{x_1Y}}{r_{x_1X}} \quad \text{and analogously,} \quad = \frac{r_{x_2Y}}{r_{x_2X}} = \frac{r_{y_1X}}{r_{y_1Y}} = \frac{r_{y_2X}}{r_{y_2Y}}.$$

By multiplying together the four preceding equations for $r_{XY}$, we
have

$$r_{XY}^4 = \frac{r_{x_1Y} \cdot r_{x_2Y} \cdot r_{y_1X} \cdot r_{y_2X}}{r_{x_1X} \cdot r_{x_2X} \cdot r_{y_1Y} \cdot r_{y_2Y}}.$$

Substituting on the right of the above equation from (j), (k),
(m), (n), (o), and (p) and taking the real positive root, we
find at last

$$r_{XY} = \frac{G(r_{x_1y_1}, r_{x_1y_2}, r_{x_2y_1}, r_{x_2y_2})}{G(r_{x_1x_2}, r_{y_1y_2})}, \tag{q}$$

where G denotes the geometrical mean.
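Formula (q) may be sketched as a small function. The names `geometric_mean` and `true_correlation` and the numerical inputs are hypothetical; the restriction to positive coefficients reflects the "real positive root" taken above and is an assumption of this sketch.

```python
import math


def geometric_mean(values):
    """Geometrical mean of a list of positive numbers."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean is taken here over positive values only")
    return math.prod(values) ** (1 / len(values))


def true_correlation(r_x1y1, r_x1y2, r_x2y1, r_x2y2, r_x1x2, r_y1y2):
    """Equation (q): G of the four observed r_xy over G of the two reliabilities."""
    observed = geometric_mean([r_x1y1, r_x1y2, r_x2y1, r_x2y2])
    reliability = geometric_mean([r_x1x2, r_y1y2])
    return observed / reliability


# Hypothetical coefficients: four observed correlations of 0.30, and
# correlations of 0.50 and 0.72 between the repeated measurements.
estimate = true_correlation(0.30, 0.30, 0.30, 0.30, 0.50, 0.72)
print(estimate)  # 0.30 / sqrt(0.50 * 0.72) = 0.30 / 0.60
```

With these inputs the observed value of 0.30 corresponds to a corrected correlation of about 0.50.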
In practice it will usually be allowable to assume that the
two series of measurements of the same series of things have
been conducted with equal accuracy. Then $r_{x_1X} = r_{x_2X}$ and
$r_{y_1Y} = r_{y_2Y}$, so that equation (q) becomes

$$r_{XY} = \frac{r_{xy} \; (= r_{x_1y_1} = r_{x_1y_2} = r_{x_2y_1} = r_{x_2y_2})}{G(r_{x_1x_2}, r_{y_1y_2})}.$$

The discrepancies that will occur between the four actual
values of $r_{xy}$ must then be attributed to mere chance, and must
be met, as usual, by taking an average. Thus we get, on the
assumption of the equally accurate series of measurements,

$$r_{XY} = \frac{r_{x_1y_1} + r_{x_1y_2} + r_{x_2y_1} + r_{x_2y_2}}{4 \sqrt{r_{x_1x_2} \cdot r_{y_1y_2}}}.$$
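The averaged formula can be tried on artificial data in which the true correlation is known beforehand. Everything in the sketch below is assumed for illustration: the true correlation of 0.6, the equal error variances, the seed, the sample size and the helper names.

```python
import math
import random


def pearson(u, v):
    """Pearson's r for two equally long series of raw values."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def corrected_correlation(x1, x2, y1, y2):
    """Return (average observed r_xy, that average over sqrt(r_x1x2 * r_y1y2))."""
    observed = [pearson(a, b) for a in (x1, x2) for b in (y1, y2)]
    mean_observed = sum(observed) / 4
    return mean_observed, mean_observed / math.sqrt(pearson(x1, x2) * pearson(y1, y2))


random.seed(3)  # arbitrary seed
n = 20000
rho = 0.6  # assumed true correlation between X and Y
X = [random.gauss(0, 1) for _ in range(n)]
Y = [rho * v + math.sqrt(1 - rho ** 2) * random.gauss(0, 1) for v in X]
# Each characteristic is measured twice with independent, equally large errors.
x1 = [v + random.gauss(0, 1) for v in X]
x2 = [v + random.gauss(0, 1) for v in X]
y1 = [v + random.gauss(0, 1) for v in Y]
y2 = [v + random.gauss(0, 1) for v in Y]

raw, corrected = corrected_correlation(x1, x2, y1, y2)
print(raw, corrected)
```

Under these assumptions the average observed coefficient falls to roughly half the true value, while the corrected coefficient comes back close to 0.6, apart from sampling fluctuation.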

The above proof of the formula for eliminating the effect of 
observational errors is, as we have seen, exact and perfectly 
general. It holds good whatever may be the distribution of 
values, or the size or distribution of the observational errors, 
in the series concerned and whatever may be the correlation's 
form. Any discrepancies arise solely from the practical neces- 
sity of applying the formula, not to the whole series of values 
considered, but only to 'random samples' of such series. By 
sufficiently extending the experiments, the chance of discrep- 
ancy may be reduced as much as desired. The formula for 
irrelevant factors is equally general, except for the two limita- 
tions explained above. 

Both formulae concern themselves, however, solely with r,
that is, with $\frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}}$. But, as mentioned before, when the
relation between the two characteristics investigated assumes
some special form, instead of the normal 'linear' form of simple 
proportion, then this special form finds no expression whatever 
in r taken alone. To express this form analytically, other addi- 
tional terms are required. The exact nature of these additional 
terms (as well as the outward shape of r itself) varies some- 
what according to the method of calculation adopted. But 
there is, probably, no mathematical difficulty in devising modi- 
fications of our corrective formulae to suit any such terms. 

It should be observed, that in many cases the non-linear 
form is more apparent than real. Generally speaking, a mere 
tendency of two characteristics to vary concurrently must be 
taken, it seems to me, as the effect of some particular underly- 
ing strict law (or laws) partly neutralized by a multitude of 
'casual' disturbing influences. The quantity of a correlation 
is neither more nor less than the relative influence of the
underlying law in question as compared with the total of all the
influences in play. Now, it may easily happen, that the under- 
lying law is one of simple proportionality but the disturbing 
influences become greater when the correlated characteristics 
are larger (or smaller, as the case may be). Then the underly- 
ing simple proportionality will not appear on the surface; the 
correlation will seem non-linear. Under such circumstances, 
r cannot, it is true, express these variations in the quantity of 
correlation; it continues, however, to express completely the 
mean quantity of correlation. 

In the majority of the remaining cases of non-linearity, the 
latter is merely due to a wrong choice of the correlated terms. 
For instance, the correlation between the length of the skull 
and the weight of the brain must, obviously, be very far from 
linear. But linearity is at once restored (supposing all the 
skulls to belong to one type) if we change the second term 
from the brain's weight to the cube root of the weight. 
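The effect of changing the correlated term can be seen in an idealized case. The sketch below assumes, purely for illustration, that weight is exactly proportional to the cube of length, with no disturbing influences, and uses arbitrary units over a deliberately wide range so that the curvature is visible.

```python
import math


def pearson(u, v):
    """Pearson's r for two equally long series of raw values."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


# Idealized lengths in arbitrary units and weights following an exact cubic law.
lengths = [float(i) for i in range(1, 11)]
weights = [l ** 3 for l in lengths]

r_raw = pearson(lengths, weights)
r_cube_root = pearson(lengths, [w ** (1 / 3) for w in weights])

print(r_raw, r_cube_root)
```

Although the law connecting the two terms is strict, r for length against weight falls short of 1; against the cube root of the weight it equals 1 to rounding error.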

To conclude, even when the underlying law itself really has a 
special non-linear form, although r by itself reveals nothing of 
this form, it nevertheless still gives (except in a few extreme 
and readily noticeable cases) a fairly approximate measure of 
the correlation's quantity.¹

¹ Several writers, who have made otherwise valuable contributions
to the subject of correlation, but have been too exclusively guided by
the purely mathematical point of view, appear to have wholly
overlooked this fundamental distinction between the form and the
quantity of a correlation.



